A Fractal Approach for Selecting an Appropriate Bin Size for Cell-Based Diversity Estimation
Abstract
A novel approach for selecting an appropriate bin size for cell-based diversity assessment is presented. The method measures the sensitivity of the diversity index as a function of grid resolution, using a box-counting algorithm that is reminiscent of those used in fractal analysis. It is shown that the relative variance of the diversity score (sum of squared cell occupancies) of several commonly used molecular descriptor sets exhibits a bell-shaped distribution, whose exact characteristics depend on the distribution of the data set, the number of points considered, and the dimensionality of the feature space. The peak of this distribution represents the optimal bin size for a given data set and sample size. Although box counting can be performed in an algorithmically efficient manner, the ability of cell-based methods to distinguish between subsets of different spread falls sharply with dimensionality, and the method becomes useless beyond a few dimensions.
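The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unit-hypercube scaling, the synthetic two-cluster data, and in particular the definition of relative variance as variance divided by squared mean over random subsets are all assumptions made for illustration. The sketch scores random subsets by the sum of squared cell occupancies at several grid resolutions and reports how much that score varies at each one.

```python
import random
from collections import Counter
from statistics import mean, pvariance


def diversity_score(points, bins_per_axis):
    """Sum of squared cell occupancies on a uniform grid.

    Assumes every coordinate is already scaled to [0, 1]; values outside
    that range are clamped into the first or last bin.
    """
    cells = Counter(
        tuple(max(0, min(int(x * bins_per_axis), bins_per_axis - 1)) for x in p)
        for p in points
    )
    return sum(n * n for n in cells.values())


def relative_variance(data, subset_size, bins_per_axis, trials=200, seed=0):
    """Variance / mean^2 of the score over random subsets.

    This definition of "relative variance" is an assumption; the abstract
    does not spell it out.
    """
    rng = random.Random(seed)
    scores = [
        diversity_score(rng.sample(data, subset_size), bins_per_axis)
        for _ in range(trials)
    ]
    m = mean(scores)
    return pvariance(scores) / (m * m)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical 2-D data set: two Gaussian clusters inside the unit square.
    data = [
        tuple(min(max(rng.gauss(c, 0.08), 0.0), 1.0) for _ in range(2))
        for c in (0.3, 0.7)
        for _ in range(1000)
    ]
    # Scan grid resolutions and look for the one where the score is most
    # sensitive to which subset was drawn.
    for bins in (1, 2, 4, 8, 16, 32, 64, 128):
        print(bins, relative_variance(data, subset_size=100, bins_per_axis=bins))
```

With a single bin per axis every subset of a given size receives the same score, so the relative variance is exactly zero. At very fine resolutions almost every point occupies its own cell and the variance again tends toward zero. Under the assumptions above, the resolution with the largest value in between would be the candidate bin size.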
Similar resources
Performance of eight mathematical models in describing particle size in some soils of Chaharmahal and Bakhtiari Province
Selecting an appropriate particle size distribution (PSD) model for a particular soil may be important for a precise estimation of soil hydraulic properties. Various models have been proposed for describing soil PSDs. The objective of this study was to compare the quality of fitting of eight PSD models (Fredlund, Gompertz, van Genuchten, Jaki, Logarithmic, Exponential, Logarithmic-Exponential a...
Assessment of protected vs. degraded oak forests: A geostatistical approach based on soil and plant diversity
Assessment of forest soil and vegetation characteristics provides basic and essential information for protection and rehabilitation measures in forest ecosystems. Therefore, given the importance of this issue, the distribution of different soil properties and vegetation diversity in relation to conservation management and degradation was investigated in the oak forests of Ilam province usin...
The effect of estimation methods on fractal modeling for anomalies’ detection in the Irankuh area, Central Iran
This study aims to identify the effect of the Ordinary Kriging (OK) and Inverse Distance Weighted (IDW) estimation methods on the separation of geochemical anomalies based on soil samples, using the Concentration-Area (C-A) fractal model in the Irankuh area, central Iran. Variograms and anisotropic ellipsoids were generated for the Pb and Zn distribution. Threshold values from the C-A log-log plots based on the e...
DASTWAR: a tool for completeness estimation in magnitude-size plane
Today, great observatories around the world devote a substantial amount of observing time to sky surveys. The resulting images are inputs to source-finder modules. These modules search for the target objects and provide us with source catalogues. We sought to quantify the ability of detection tools to recover faint galaxies regularly encountered in deep surveys. Our approach was based on com...
Resources classification using fractal modelling in Eastern Kahang Cu-Mo porphyry deposit, Central Iran
Resources/reserves classification is crucial for block model creation utilised in mine planning and feasibility studies. Selection of estimation methods is an essential part of mineral exploration and mining activities. In other words, resources classification is an issue for mining companies, investors, financial institutions and authorities, but it remains subject to some confusion. The aim of t...
Journal title:
- Journal of Chemical Information and Computer Sciences
Volume 42, Issue 1
Publication date: 2002